-
Notifications
You must be signed in to change notification settings - Fork 223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for multimodal openai - early version #313
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(1) image output potentially (2) test generator (3) test using real api by yourself
# For image generation, input is the prompt | ||
final_model_kwargs["prompt"] = input | ||
# Set defaults for DALL-E 3 if not specified | ||
if "model" not in final_model_kwargs: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"model" must be set. We dont have to do default, because in the future, a model can be deprecated.
if "size" not in final_model_kwargs: | ||
final_model_kwargs["size"] = "1024x1024" | ||
if "quality" not in final_model_kwargs: | ||
final_model_kwargs["quality"] = "standard" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of using so many if not, use final_model_kwargs["quality"] = final_model_kwargs.get("quality", "standard")
final_model_kwargs["response_format"] = "url" | ||
|
||
# Handle image edits and variations | ||
if "image" in final_model_kwargs: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these code is kind of ugly, might want to avoid so many embeded if
overall it looks great @fm1320 |
@fm1320 i need to see the testing of the generator using openai client, did you test it and add examples in the generator rst and ipynb |
@@ -2043,6 +2043,272 @@ | |||
"build_custom_model_client()" | |||
] | |||
}, | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the examples you add fit more into the generator.ipynb, please move there. The modelclient layer is not really user facing layer but more for contributors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing with new changes now
@@ -106,6 +106,161 @@ In particular, we created :class:`GeneratorOutput<core.types.GeneratorOutput>` t | |||
Whether to do further processing or terminate the pipeline whenever an error occurs is up to the user from here on. | |||
|
|||
|
|||
Basic Generator Tutorial |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add this in the genearator colab. They all go together. i can approve now, but you would need that for easy sharing
The PR adds multimodal (text + image) support to the existing OpenAI client while maintaining backward compatibility with text-only operations. I also adds image generation with Dall E 2 and 3. It also adds tests and updates docstring
Add DALL-E Image Generation Support
Key Changes
IMAGE_GENERATION
model typeExample use:
Text only:
Multimodal:
Image generation:
TODO:
Fixes #<issue_number>
Before submitting